SPIN Is Cool And I Am Still Confused
I keep hearing about SPIN. Self-Play Fine-Tuning. It sounds like a yoga class for language models. It is not. It is cooler. It is a training method that lets models get better by playing against themselves. No new data required. No API credits. Just pure, unadulterated self-play.
SPIN is like having your model play chess against itself. Except the chessboard is language. And the pieces are tokens. And the model is both players. And somehow this makes it smarter. I am not making this up.
What SPIN Actually Does
Here is the simple version. You have a model. You ask it a question. It generates an answer. Then you ask it: was that answer good? It compares its own output to a reference. It learns from the difference. It tries again. It gets slightly less wrong.
Repeat this process. The model plays against itself. It generates. It compares. It updates. It generates again. Each iteration refines the behavior. No new data. No external teacher. Just the model, its own outputs, and a loss function that says "be more like the good version of yourself."
while not smart_enough:
    answer = model.generate(question)
    reference = get_human_reference(question)
    loss = compare(answer, reference)
    model.update(loss)
    # Hope for the best
# Eventually the model stops outputting pipe characters. Maybe.
The magic is in the comparison. The model learns to distinguish its own outputs from human-like references. It does not need to be told what is right. It just needs to see the gap between what it said and what a human would say. Then it closes the gap. Slowly. Painfully. But it closes it.
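For the curious: the objective behind SPIN looks a lot like DPO, except the "rejected" answer is the model's own generation from a previous iteration and the "chosen" answer is the human reference. A minimal sketch of that pairwise loss, assuming you already have per-sequence log-probabilities from the current and previous models (the function name and arguments here are mine, not from any official SPIN code):

```python
import math

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) for both large positive and negative x
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def spin_pair_loss(lp_human_new, lp_human_old, lp_gen_new, lp_gen_old, beta=1.0):
    """How strongly the current model prefers the human reference over its
    own old generation, relative to the previous model. Lower is better."""
    margin = (lp_human_new - lp_human_old) - (lp_gen_new - lp_gen_old)
    return -log_sigmoid(beta * margin)

# Before any learning the margin is zero, so the loss starts at log(2)
print(round(spin_pair_loss(-10.0, -10.0, -5.0, -5.0), 4))  # 0.6931
```

As training pushes probability toward the reference and away from the model's old answers, the margin grows and the loss decays toward zero. When it hits zero everywhere, the model can no longer tell its own outputs from the human data. That is the gap closing.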
Why SPIN Is Cool
First, it does not need more data. Most training methods require fresh datasets. SPIN uses what you already have. The model generates its own training signal. This is efficient. This is elegant. This is the kind of thing I wish I had thought of first.
Second, it works with tiny models. I train models with one million parameters. SPIN does not care. It scales down. It scales up. It just works. My Haiku-2 model uses SPIN. It still says weird things. But it says them more confidently now. Progress.
Third, it is self-contained. No external reward model. No complex RLHF pipeline. Just the model, a reference set, and a loop. This simplicity is beautiful. It is also suspicious. But I am choosing to believe it is beautiful.
SPIN proves that sometimes the best teacher is yourself. Even if yourself is a confused language model with a million parameters and a tendency to output fish facts.
How I Use It
I added SPIN to my training loop after publishing Haiku-2. Obviously that is what I needed to do. The implementation is straightforward. Generate responses. Compare to references. Compute loss. Update weights. Repeat.
The hard part is the references. You need human-like outputs to compare against. I use distilled data from frontier models. I use curated datasets. I use anything that looks like a reasonable answer. Then I let SPIN do the rest.
references = load_distilled_data()
for epoch in range(num_epochs):
    for question, ref in references:
        answer = model.generate(question)
        loss = spin_loss(answer, ref)
        model.update(loss)
        # Watch the loss curve. Cry if it NaNs.
# It NaNs sometimes. I cry sometimes. We are bonded now.
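When it does NaN, the cheapest defense I know is to skip that update instead of letting one bad batch poison the weights. A sketch of the guard, with the actual backward/step stubbed out as comments (`safe_update` is my own helper, not a library function):

```python
import math

def safe_update(batch_losses):
    """Apply only finite losses; count and skip NaN/inf batches."""
    applied, skipped = [], 0
    for loss in batch_losses:
        if math.isfinite(loss):
            applied.append(loss)  # real loop: loss.backward(); optimizer.step()
        else:
            skipped += 1          # real loop: zero the grads and move on
    return applied, skipped

applied, skipped = safe_update([0.9, float("nan"), 0.7, float("inf"), 0.5])
print(skipped)  # 2
```

Skipping only hides the symptom, so it is worth logging which batches blew up. Gradient clipping before the step helps too.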
What I Have Noticed
With SPIN enabled, my models get better at not being wrong. They still get things wrong. Just less often. Just in more subtle ways. Just enough that you might mistake competence for luck if you are not paying attention.
Haiku-2 outputs fewer pipe characters. It forms more complete sentences. It occasionally remembers that Paris is a city. These are small wins. But they are wins. And wins feel good when you spend most of your time debugging NaN losses.
Why SPIN Might Not Work For You
SPIN requires references. If you do not have human-like outputs to compare against, SPIN cannot help. You need quality data. You need diversity. You need enough examples that the model learns patterns, not just memorizes answers.
SPIN also takes time. Each iteration requires generation and comparison. This is slower than standard fine-tuning. If you are in a hurry, SPIN might not be your friend. If you are patient, SPIN might be your new best friend.
Final Thoughts
SPIN is cool. It lets models teach themselves. It does not need endless data. It works with tiny architectures. It is simple. It is elegant. It is suspiciously effective.
I am using SPIN for Haiku-2. I will use it for Sonnet. I will probably break something. I will probably learn something. That is the cycle. That is the fun.
If you are training small models, give SPIN a try. It might not fix everything. It might not make your model smart. But it might make it slightly less confused. And sometimes slightly less confused is good enough.